New Parallel Prefix Algorithms
Authors
Abstract
New families of computation-efficient parallel prefix algorithms for message-passing multicomputers are presented. The first family improves the communication time of a previous family of parallel prefix algorithms; both use only half-duplex communications. Two other families adopt collective communication operations to reduce the communication times of the former two, respectively. Each family offers a choice between fewer computation time steps and fewer communication time steps, so that the minimal running time can be achieved according to the ratio of the time required by a communication step to the time required by a computation step.
Key-Words: Collective communication, Computation-efficient parallel prefix, Half-duplex, Message-passing multicomputer, Parallel algorithm, Precondition
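The trade-off described in the abstract can be illustrated with a minimal sketch. It assumes a simple cost model in which running time, measured in units of one computation step, equals the number of computation steps plus the communication-to-computation ratio times the number of communication steps. The variant names and step counts below are hypothetical and are not taken from the paper; `running_time` and `best_variant` are illustrative helpers, not part of any published algorithm.

```python
def running_time(comp_steps, comm_steps, ratio):
    """Total time in units of one computation step.

    ratio = (time of one communication step) / (time of one computation step).
    Assumed cost model: computation and communication steps do not overlap.
    """
    return comp_steps + ratio * comm_steps


def best_variant(variants, ratio):
    """Return the (name, comp_steps, comm_steps) tuple with the smallest running time."""
    return min(variants, key=lambda v: running_time(v[1], v[2], ratio))


# Hypothetical step counts for two members of one family (NOT from the paper).
variants = [
    ("fewer-computation", 20, 8),     # fewer computation steps, more communication
    ("fewer-communication", 26, 5),   # more computation steps, fewer communication
]

for ratio in (1, 4):
    name, comp, comm = best_variant(variants, ratio)
    print(f"ratio={ratio}: choose {name}, time={running_time(comp, comm, ratio)}")
```

Under these assumed numbers, cheap communication (ratio 1) favours the variant with fewer computation steps, while expensive communication (ratio 4) favours the variant with fewer communication steps.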
Similar resources
New Families of Computation-Efficient Parallel Prefix Algorithms
New families of computation-efficient parallel prefix algorithms for message-passing multicomputers are presented. The first family improves the communication time of a previous family of parallel prefix algorithms; both use only half-duplex communications. Two other families adopt collective communication operations to reduce the communication times of the former two, respectively. The precond...
A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure
The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication that can run on a Fibonacci Hypercube structure. Most popular algorithms for parallel matrix multiplication cannot run on a Fibonacci Hypercube structure; therefore, a method that can run on all structures, especially the Fibonacci Hypercube structure, is necessary for parallel matr...
Parallel Prefix Scan with Compute Unified Device Architecture (CUDA)
Parallel prefix scan, also known as parallel prefix sum, is a building block for many parallel algorithms, including polynomial evaluation, sorting, and building data structures. This paper introduces prefix scan and describes a step-by-step procedure to implement prefix scan efficiently with Compute Unified Device Architecture (CUDA). This paper starts with a basic naive algorithm and procee...
Computation-Efficient Parallel Prefix
We are interested in solving the prefix problem of n inputs using p < n processors on completely connected distributed-memory multicomputers (CCDMMs). This paper improves a previous work in three respects. First, the communication time of the previous algorithm is reduced significantly. Second, we show that p(p + 1)/2 < n is required for the new algorithm and the original one to be applicable. ...
Parallel Algorithms for Bucket Sorting and the Data Dependent Prefix Problem
The data dependent prefix problem is to compute all the n initial products x1 ∘ x2 ∘ ... ∘ xk, 1 ≤ k ≤ n, where the order is specified by a linked list. A parallel algorithm for the data dependent prefix problem is presented. This algorithm has time complexity O(n/p + log n log(n/p)) using p processors on the exclusive-read exclusive-write computation model. A bucket sorting algorithm is also develo...
Publication date: 2009